Goto

Collaborating Authors

 efficient channel attention


SLAM in the Dark: Self-Supervised Learning of Pose, Depth and Loop-Closure from Thermal Images

Xu, Yangfan, Hao, Qu, Zhang, Lilian, Mao, Jun, He, Xiaofeng, Wu, Wenqi, Chen, Changhao

arXiv.org Artificial Intelligence

SLAM in the Dark: Self-Supervised Learning of Pose, Depth and Loop-Closure from Thermal Images Y angfan Xu, Qu Hao, Lilian Zhang, Jun Mao, Xiaofeng He, Wenqi Wu, Changhao Chen* Abstract -- Visual SLAM is essential for mobile robots, drone navigation, and VR/AR, but traditional RGB camera systems struggle in low-light conditions, driving interest in thermal SLAM, which excels in such environments. However, thermal imaging faces challenges like low contrast, high noise, and limited large-scale annotated datasets, restricting the use of deep learning in outdoor scenarios. We present DarkSLAM, a noval deep learning-based monocular thermal SLAM system designed for large-scale localization and reconstruction in complex lighting conditions.Our approach incorporates the Efficient Channel Attention (ECA) mechanism in visual odometry and the Selective Kernel Attention (SKA) mechanism in depth estimation to enhance pose accuracy and mitigate thermal depth degradation. Additionally, the system includes thermal depth-based loop closure detection and pose optimization, ensuring robust performance in low-texture thermal scenes. Extensive outdoor experiments demonstrate that DarkSLAM significantly outperforms existing methods like SC-Sfm-Learner and Shin et al., delivering precise localization and 3D dense mapping even in challenging nighttime environments. I. INTRODUCTION Simultaneous Localization and Mapping (SLAM) is crucial for intelligent systems from mobile robots, drones, to self-driving vehicles, enabling their real-time localization and mapping for autonomous navigation. Traditional visual SLAM systems, which rely on visible-light (RGB) cameras, struggle in challenging lighting conditions such as strong light, shadows, or nighttime, limiting their use in all-time scenarios. Thermal cameras, which detect heat radiation, offer a solution by functioning in darkness, smoke, and dust, complementing visible-light sensors. Recent improvements in thermal camera resolution and sensitivity have increased their reliability in autonomous systems.


MSSC-BiMamba: Multimodal Sleep Stage Classification and Early Diagnosis of Sleep Disorders with Bidirectional Mamba

Zhang, Chao, Cui, Weirong, Guo, Jingjing

arXiv.org Artificial Intelligence

Monitoring sleep states is essential for evaluating sleep quality and diagnosing sleep disorders. Traditional manual staging is time-consuming and prone to subjective bias, often resulting in inconsistent outcomes. Here, we developed an automated model for sleep staging and disorder classification to enhance diagnostic accuracy and efficiency. Considering the characteristics of polysomnography (PSG) multi-lead sleep monitoring, we designed a multimodal sleep state classification model, MSSC-BiMamba, that combines an Efficient Channel Attention (ECA) mechanism with a Bidirectional State Space Model (BSSM). The ECA module allows for weighting data from different sensor channels, thereby amplifying the influence of diverse sensor inputs. Additionally, the implementation of bidirectional Mamba (BiMamba) enables the model to effectively capture the multidimensional features and long-range dependencies of PSG data. The developed model demonstrated impressive performance on sleep stage classification tasks on both the ISRUC-S3 and ISRUC-S1 datasets, respectively containing data with healthy and unhealthy sleep patterns. Also, the model exhibited a high accuracy for sleep health prediction when evaluated on a combined dataset consisting of ISRUC and Sleep-EDF. Our model, which can effectively handle diverse sleep conditions, is the first to apply BiMamba to sleep staging with multimodal PSG data, showing substantial gains in computational and memory efficiency over traditional Transformer-style models. This method enhances sleep health management by making monitoring more accessible and extending advanced healthcare through innovative technology.


Brief Review -- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

#artificialintelligence

ECA-Net clearly outperforms SENet, and also outperforms fixed kernel version of ECA-Net. ECA-Net is superior to SENet and CBAM while it is very competitive to AA-Net with lower model complexity. Note that AA-Net is trained with Inception data augmentation and different setting of learning rates. ECA-Net performs favorably against state-of-the-art CNNs while benefiting much lower model complexity. Different frameworks are used, ECA-Net can well generalize to object detection task.


論文閱讀 CVPR 2020 -- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

#artificialintelligence

目前在電腦視覺的領域上,具備深度的卷積神經網路已經在多種任務上取得很大的進步。近年在 Convolution block 的設計上,有發展出一系列相當有潛力的研究,主要是探討將 Attention 機制應用於 Channel 維度的作法。 其中相當具代表性的研究是 SENet,後續也有一些研究嘗試利用更複雜的機制捕捉 Channel-wise dependencies ,或結合 Spatial…